Notes on the runtime of A* sampling
The challenge of simulating random variables is a central problem in
Statistics and Machine Learning. Given a tractable proposal distribution $Q$,
from which we can draw exact samples, and a target distribution $P$ which is
absolutely continuous with respect to $Q$, the A* sampling algorithm allows
simulating exact samples from $P$, provided we can evaluate the Radon-Nikodym
derivative $dP/dQ$ of $P$ with respect to $Q$. Maddison et al. originally showed that
for a target distribution $P$ and proposal distribution $Q$, the runtime of A*
sampling is upper bounded by $\mathcal{O}(\exp(D_\infty[P \| Q]))$, where
$D_\infty[P \| Q]$ is the Rényi divergence of order $\infty$ from $P$ to $Q$. This runtime can be
prohibitively large for many cases of practical interest. Here, we show that
with additional restrictive assumptions on $P$ and $Q$, we can achieve much
faster runtimes. Specifically, we show that if $P$ and $Q$ are distributions on
$\mathbb{R}$ and their Radon-Nikodym derivative is unimodal, the runtime of A*
sampling is $\mathcal{O}(D_\infty[P \| Q])$, which is exponentially faster than
A* sampling without assumptions.
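To make the quantity driving both bounds concrete: $D_\infty[P \| Q] = \log \operatorname{ess\,sup}_x \, dP/dQ(x)$, the log of the largest density ratio. The sketch below is an illustration of our own (not code from the paper), computing $D_\infty$ for a Gaussian target and a wider Gaussian proposal, a case where the density ratio is bounded and unimodal, so the fast $\mathcal{O}(D_\infty)$ regime would apply.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

# Illustrative example (not from the paper): target P = N(0, 1),
# proposal Q = N(0, s^2) with s > 1, so dP/dQ is bounded and unimodal.
s = 2.0

def log_ratio(x):
    # log dP/dQ(x) = log p(x) - log q(x)
    return norm.logpdf(x, 0.0, 1.0) - norm.logpdf(x, 0.0, s)

# D_inf(P || Q) = log sup_x dP/dQ(x); maximise the log-ratio numerically.
res = minimize_scalar(lambda x: -log_ratio(x))
d_inf = log_ratio(res.x)  # here the sup is attained at x = 0, value log(s)

print(f"D_inf(P||Q) ~= {d_inf:.4f} (closed form: log s = {np.log(s):.4f})")
print(f"general A* bound    ~ exp(D_inf) = {np.exp(d_inf):.2f}")
print(f"unimodal-case bound ~ D_inf      = {d_inf:.2f}")
```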
Environmental Sensor Placement with Convolutional Gaussian Neural Processes
Environmental sensors are crucial for monitoring weather conditions and the
impacts of climate change. However, it is challenging to maximise measurement
informativeness and place sensors efficiently, particularly in remote regions
like Antarctica. Probabilistic machine learning models can evaluate placement
informativeness by predicting the uncertainty reduction provided by a new
sensor. Gaussian process (GP) models are widely used for this purpose, but they
struggle with capturing complex non-stationary behaviour and scaling to large
datasets. This paper proposes using a convolutional Gaussian neural process
(ConvGNP) to address these issues. A ConvGNP uses neural networks to
parameterise a joint Gaussian distribution at arbitrary target locations,
enabling flexibility and scalability. Using simulated surface air temperature
anomaly over Antarctica as ground truth, the ConvGNP learns spatial and
seasonal non-stationarities, outperforming a non-stationary GP baseline. In a
simulated sensor placement experiment, the ConvGNP better predicts the
performance boost obtained from new observations than GP baselines, leading to
more informative sensor placements. We contrast our approach with physics-based
sensor placement methods and propose future work towards an operational sensor
placement recommendation system. This system could help to realise
environmental digital twins that actively direct measurement sampling to
improve the digital representation of reality.
Comment: In review for the Climate Informatics 2023 special issue of Environmental Data Science.
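The "placement informativeness" criterion in this abstract can be sketched with a plain GP baseline: score each candidate sensor by the total drop in posterior predictive variance it would induce at a set of target sites. The snippet below is a minimal, generic illustration under assumed choices (a 1-D domain, an RBF kernel, and helper names like `variance_reduction` that are ours), not the paper's ConvGNP.

```python
import numpy as np

def rbf(a, b, lengthscale=0.5, variance=1.0):
    # Squared-exponential kernel between two sets of 1-D locations.
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def variance_reduction(candidate, targets, noise=1e-2):
    """Total predictive-variance reduction at `targets` from observing
    one new (noisy) sensor at `candidate` under a zero-mean GP prior."""
    x = np.array([candidate])
    k_tx = rbf(targets, x)
    k_xx = rbf(x, x) + noise * np.eye(1)
    # Posterior variance at the targets drops by k_tx k_xx^{-1} k_xt.
    reduction = k_tx @ np.linalg.solve(k_xx, k_tx.T)
    return np.trace(reduction)

targets = np.linspace(0.0, 1.0, 50)     # sites whose uncertainty we care about
candidates = np.linspace(0.0, 1.0, 11)  # possible sensor locations
scores = [variance_reduction(c, targets) for c in candidates]
best = candidates[int(np.argmax(scores))]
print(f"most informative placement: x = {best:.2f}")
```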
Trieste: Efficiently Exploring The Depths of Black-box Functions with TensorFlow
We present Trieste, an open-source Python package for Bayesian optimization
and active learning benefiting from the scalability and efficiency of
TensorFlow. Our library enables the plug-and-play of popular TensorFlow-based
models within sequential decision-making loops, e.g. Gaussian processes from
GPflow or GPflux, or neural networks from Keras. This modular mindset is
central to the package and extends to our acquisition functions and the
internal dynamics of the decision-making loop, both of which can be tailored
and extended by researchers or engineers when tackling custom use cases.
Trieste is a research-friendly and production-ready toolkit backed by a
comprehensive test suite and extensive documentation; it is available at
https://github.com/secondmind-labs/trieste
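To give a flavour of the plug-and-play loop described above, here is a sketch modelled on Trieste's quickstart: a GPflow GP built with a library helper and optimized over a box domain. The exact module paths and helper names have varied across Trieste releases, so treat the identifiers below as version-dependent assumptions rather than authoritative API.

```python
import trieste
from trieste.objectives import ScaledBranin
from trieste.objectives.utils import mk_observer

# Search space and an observer wrapping the (minimisation) objective.
search_space = trieste.space.Box([0.0, 0.0], [1.0, 1.0])
observer = mk_observer(ScaledBranin.objective)

# A handful of initial Sobol points to seed the model.
initial_data = observer(search_space.sample_sobol(5))

# Plug in a GPflow model via Trieste's builder and wrapper.
gpflow_model = trieste.models.gpflow.build_gpr(initial_data, search_space)
model = trieste.models.gpflow.GaussianProcessRegression(gpflow_model)

# Run the sequential decision-making loop (default rule: expected improvement).
bo = trieste.bayesian_optimizer.BayesianOptimizer(observer, search_space)
result = bo.optimize(15, initial_data, model)
print(result.try_get_optimal_point())
```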
Efficient Gaussian Neural Processes for Regression
Conditional Neural Processes (CNP; Garnelo et al., 2018) are an attractive
family of meta-learning models which produce well-calibrated predictions,
enable fast inference at test time, and are trainable via a simple maximum
likelihood procedure. A limitation of CNPs is their inability to model
dependencies in the outputs. This significantly hurts predictive performance
and renders it impossible to draw coherent function samples, which limits the
applicability of CNPs in downstream applications and decision making. Neural
Processes (NPs; Garnelo et al., 2018) attempt to alleviate this issue by using
latent variables, relying on these to model output dependencies, but this
introduces difficulties stemming from approximate inference. One recent alternative
(Bruinsma et al., 2021), which we refer to as the FullConvGNP, models
dependencies in the predictions while still being trainable via exact
maximum-likelihood. Unfortunately, the FullConvGNP relies on expensive
two-dimensional convolutions, which limit its applicability to only
one-dimensional data. In this work, we present an alternative way to model
output dependencies which also lends itself to maximum-likelihood training but,
unlike the FullConvGNP, can be scaled to two- and three-dimensional data. The
proposed models exhibit good performance in synthetic experiments.
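The abstract leaves the alternative parameterisation implicit. One natural low-rank construction consistent with this line of work is to have a network emit, for each target point, a predicted mean and a small feature vector, with inner products of the features defining the joint covariance. The sketch below (all names hypothetical, with a random stand-in for the network) shows how such per-point outputs yield a joint Gaussian and coherent function samples without any 2D convolutions.

```python
import numpy as np

rng = np.random.default_rng(0)

def fake_network(x, feature_dim=8):
    """Stand-in for a neural network: maps each target location to a
    predicted mean and a feature vector (here, random projections)."""
    w_mu = rng.normal(size=(1,))
    w_g = rng.normal(size=(1, feature_dim))
    mean = np.sin(3.0 * x) * w_mu          # (n,) predicted means
    feats = np.tanh(x[:, None] @ w_g)      # (n, feature_dim) features
    return mean, feats

x_target = np.linspace(-1.0, 1.0, 100)
mean, g = fake_network(x_target)

# Joint Gaussian over all targets: low-rank-plus-diagonal covariance, so
# output dependencies come from shared features, not 2D convolutions.
cov = g @ g.T + 1e-3 * np.eye(len(x_target))

# Coherent function samples, which a vanilla CNP cannot produce.
samples = rng.multivariate_normal(mean, cov, size=3)
print(samples.shape)  # (3, 100)
```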
Practical Conditional Neural Processes Via Tractable Dependent Predictions
Conditional Neural Processes (CNPs; Garnelo et al., 2018a) are meta-learning
models which leverage the flexibility of deep learning to produce
well-calibrated predictions and naturally handle off-the-grid and missing data.
CNPs scale to large datasets and train with ease. Due to these features, CNPs
appear well-suited to tasks from environmental sciences or healthcare.
Unfortunately, CNPs do not produce correlated predictions, making them
fundamentally inappropriate for many estimation and decision making tasks.
Predicting heat waves or floods, for example, requires modelling dependencies
in temperature or precipitation over time and space. Existing approaches which
model output dependencies, such as Neural Processes (NPs; Garnelo et al.,
2018b) or the FullConvGNP (Bruinsma et al., 2021), are either complicated to
train or prohibitively expensive. What is needed is an approach which provides
dependent predictions, but is simple to train and computationally tractable. In
this work, we present a new class of Neural Process models that make correlated
predictions and support exact maximum likelihood training that is simple and
scalable. We extend the proposed models by using invertible output
transformations, to capture non-Gaussian output distributions. Our models can
be used in downstream estimation tasks which require dependent function
samples. By accounting for output dependencies, our models show improved
predictive performance on a range of experiments with synthetic and real data.
Comment: 23 pages; accepted to the 10th International Conference on Learning Representations (ICLR 2022).
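To illustrate the invertible-output-transformation idea in this last abstract: if the model predicts a Gaussian over a latent $z$ and the observed output is $y = f(z)$ for invertible $f$, the exact log-likelihood follows from the change-of-variables formula, $\log p(y) = \log \mathcal{N}(f^{-1}(y); \mu, \sigma^2) + \log |d f^{-1}/dy|$. A minimal sketch with an exp warp (giving log-normal marginals, suitable for positive quantities like precipitation) follows; the specific transformation here is our choice for illustration, not necessarily the paper's.

```python
import numpy as np

def gaussian_logpdf(z, mu, sigma):
    return -0.5 * ((z - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

def warped_loglik(y, mu, sigma):
    """Exact log-likelihood of positive observations y under y = exp(z),
    z ~ N(mu, sigma^2): change of variables adds log|dz/dy| = -log y."""
    z = np.log(y)                        # f^{-1}(y)
    return gaussian_logpdf(z, mu, sigma) - np.log(y)

# Maximum-likelihood training stays simple: the warp only adds a
# deterministic Jacobian term to the Gaussian log-density.
y = np.array([0.5, 1.2, 3.0])            # e.g. positive precipitation values
print(warped_loglik(y, mu=0.0, sigma=1.0).sum())
```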